Introduction (and take-home message)

The aim of this study is to characterize changes in the prevalence of resistant bacteria in a way that allows comparisons across bacteria, drugs and countries. For that, we have to acknowledge that the additional number of resistant strains in a given year compared to the previous one depends on:

  1. the rate at which resistant strains spread (i.e. fitness) and
  2. the number of resistant strains.

Indeed, a few resistant strains with a high spreading capacity can produce the just as many additional resistant strains as many resistant strains with a low spreading capacity. This implies that a proper comparisons of the dynamics of change of the number of resistant strains needs to account for the current number of resistant strains. Furthermore, because even bacterial populations cannot grow indefinitely in size, the expected number of new resistant strains also depend on

  1. the number of non-resistant strains.

Indeed, for a given value of resistance fitness and a given number of resistant strains, the additional number of resistant strains produced is lower in a case where all the bacteria are already resistant than in case where a subtantial proportion of the bacteria are still non-resistant and can be “replaced” by resistant strains.

From what precedes, the change in the proportion of resistant strains in drug sensitivity tests performed at two different points in time depends on

  1. the rate at which resistant strains spread (i.e. fitness) and
  2. the proportion of resistant strains.

We here proposed a simple model that allows to estimate the rate \(\rho\) from resistance proportion data.

Packages

Installing the required packages:

required <- c("dplyr", "magrittr", "purrr", "tidyr")
to_install <- setdiff(required, row.names(installed.packages()))
if (length(to_install)) install.packages(to_install)

Loading magrittr for interactive use:

library(magrittr)

Downloading the data from ECDC and loading into R

From Liselotte Diaz Högberg (), on 2019-02-18:

EARS-Net data are available for public download through our Surveillance Atlas, atlas.ecdc.europa.eu/public (choose Antimicrobial resistance). The atlas includes a data export option in the top right corner. This data are open for use as long as the source is acknowledged. As a researcher, you are also welcome to access EARS-Net data through a third party data access request. Please find more information here ecdc.europa.eu/en/publications-data/european-surveillance-system-tessy and do not hesitate to contact me for further details and clarifications.

We select All time periods, All regions, All indicators that we export to the ECDC_surveillance_data_Antimicrobial_resistance.csv file on 2019-10-03

ecdc <- "ECDC_surveillance_data_Antimicrobial_resistance.csv" %>% 
  read.csv() %>% 
  dplyr::filter(Unit == "N") %>% 
  dplyr::select(-HealthTopic, -Unit, -RegionCode, -TxtValue) %>% 
  tidyr::separate(Population, c("bacteria", "test"), "\\|") %>% 
  dplyr::filter(!grepl("Combined", test)) %>% 
  dplyr::mutate_if(is.factor, as.character) %>% 
  dplyr::mutate_at("NumValue", as.integer) %>% 
  tidyr::pivot_wider(names_from = Indicator, values_from = NumValue) %>% 
  dplyr::transmute(bacteria = bacteria,
                   test     = test,
                   year     = Time,
                   country  = RegionName,
                   N        = `Total tested isolates`,
                   R        = `Resistant (R) isolates`,
                   I        = `Non-susceptible (I and R) isolates` - R)

which gives:

ecdc
## # A tibble: 10,925 x 7
##    bacteria           test             year country            N     R     I
##    <chr>              <chr>           <int> <chr>          <int> <int> <int>
##  1 Acinetobacter spp. Aminoglycosides  2012 Austria            0     0     0
##  2 Acinetobacter spp. Aminoglycosides  2012 Belgium            0     0     0
##  3 Acinetobacter spp. Aminoglycosides  2012 Bulgaria          65    38     6
##  4 Acinetobacter spp. Aminoglycosides  2012 Cyprus            23    12     0
##  5 Acinetobacter spp. Aminoglycosides  2012 Czech Republic     0     0     0
##  6 Acinetobacter spp. Aminoglycosides  2012 Germany          119     7     0
##  7 Acinetobacter spp. Aminoglycosides  2012 Denmark           77     8     0
##  8 Acinetobacter spp. Aminoglycosides  2012 Estonia            0     0     0
##  9 Acinetobacter spp. Aminoglycosides  2012 Greece          1234   964   118
## 10 Acinetobacter spp. Aminoglycosides  2012 Spain              0     0     0
## # … with 10,915 more rows

where N is the total number of samples, R is the number of samples that are resistant and I is the number of samples for which the resistance level is intermediate.

Data overview

The ECDC data has data on 8 bacteria:

sort(unique(ecdc$bacteria))
## [1] "Acinetobacter spp."       "Enterococcus faecalis"    "Enterococcus faecium"     "Escherichia coli"        
## [5] "Klebsiella pneumoniae"    "Pseudomonas aeruginosa"   "Staphylococcus aureus"    "Streptococcus pneumoniae"

in 30 countries:

sort(unique(ecdc$country))
##  [1] "Austria"        "Belgium"        "Bulgaria"       "Croatia"        "Cyprus"         "Czech Republic" "Denmark"       
##  [8] "Estonia"        "Finland"        "France"         "Germany"        "Greece"         "Hungary"        "Iceland"       
## [15] "Ireland"        "Italy"          "Latvia"         "Lithuania"      "Luxembourg"     "Malta"          "Netherlands"   
## [22] "Norway"         "Poland"         "Portugal"       "Romania"        "Slovakia"       "Slovenia"       "Spain"         
## [29] "Sweden"         "United Kingdom"

over 18 years from 2000 to 2017. The tests of susceptibility that are performed are the following for the different bacteria:

with(ecdc, table(bacteria, test))
##                           test
## bacteria                   Aminoglycosides Aminopenicillins Carbapenems Ceftazidime Fluoroquinolones High-level gentamicin Macrolides
##   Acinetobacter spp.                   180                0         180           0              180                     0          0
##   Enterococcus faecalis                  0              511           0           0                0                   511          0
##   Enterococcus faecium                   0              511           0           0                0                   510          0
##   Escherichia coli                     512              512         510           0              512                     0          0
##   Klebsiella pneumoniae                386                0         386           0              386                     0          0
##   Pseudomonas aeruginosa               386                0         386         386              386                     0          0
##   Staphylococcus aureus                  0                0           0           0                0                     0          0
##   Streptococcus pneumoniae               0                0           0           0                0                     0        386
##                           test
## bacteria                   Meticillin (MRSA) Penicillins PiperacillinTazobactam Third-generation cephalosporins Vancomycin
##   Acinetobacter spp.                       0           0                      0                               0          0
##   Enterococcus faecalis                    0           0                      0                               0        511
##   Enterococcus faecium                     0           0                      0                               0        511
##   Escherichia coli                         0           0                      0                             512          0
##   Klebsiella pneumoniae                    0           0                      0                             386          0
##   Pseudomonas aeruginosa                   0           0                    386                               0          0
##   Staphylococcus aureus                  516           0                      0                               0          0
##   Streptococcus pneumoniae                 0         386                      0                               0          0

The number of biological tests performed in the countries naturally increases with time:

with(ecdc, table(year, country))
##       country
## year   Austria Belgium Bulgaria Croatia Cyprus Czech Republic Denmark Estonia Finland France Germany Greece Hungary Iceland Ireland
##   2000      12      12        1       0      0              1      12       4      12      0      12     12       0      12      12
##   2001      12      12       12      12      0             12      12      12      12     12      12     12      12      12      12
##   2002      12      12       12      12      0             12      12      12      12     12      12     12      12      12      12
##   2003      12      12       12      12     11             12      12      12      12     12      12     12      12      12      12
##   2004      12      12       12      12     12             12      12      12      12     12      12     12      12      12      12
##   2005      23      23       23      23     23             23      23      23      23     23      23     23      23      23      23
##   2006      23      23       23      23     23             23      23      23      23     23      23     23      23      23      23
##   2007      23      23       23      23     23             23      23      23      23     23      23     23      23      23      23
##   2008      23      23       23      23     23             23      23      23      23     23      23     23      23      23      23
##   2009      23      23       23       0     23             23      23      23      23     23      23     23      23      23      23
##   2010      23      23       23      23     23             23      23      23      23     23      23     23      23      23      23
##   2011      23      23       23      23     23             23      23      23      23     23      23     23      23      23      23
##   2012      26      26       26      26     26             26      26      26      26     26      26     26      26      26      26
##   2013      26      26       26      26     26             26      26      26      26     26      26     26      26      26      26
##   2014      26      26       26      26     26             26      26      26      26     26      26     26      26      26      26
##   2015      26      26       26      26     26             26      26      26      26     26      26     26      26      26      26
##   2016      26      26       26      26     26             26      26      26      26     26      26     26      26      26      26
##   2017      26      26       26      26     26             26      26      26      26     26      26     26      26      26      26
##       country
## year   Italy Latvia Lithuania Luxembourg Malta Netherlands Norway Poland Portugal Romania Slovakia Slovenia Spain Sweden United Kingdom
##   2000    12      0         0         12     1          12     12      0       12       0        0        1    12     12             12
##   2001    12      0         0         12    12          12     12     12       12       0       11       12    12     12             12
##   2002    12      0         0         12    12          12     12     12       12      12       12       12    12     12             12
##   2003    12      0         0         12    12          12     12     12       12      12       12       12    12     12             12
##   2004    12      1         0         12    12          12     12     12       12      12       12       12    12     12             12
##   2005    23     23         0         23    23          23     23     23       23      23       23       23    23     23             23
##   2006    23     23        23         23    23          23     23     23       23      23       23       23    23     23             23
##   2007    23     23        23         23    23          23     23     23       23      23       23       23    23     23             23
##   2008    23     23        23         23    23          23     23     23       23      23       23       23    23     23             23
##   2009    23     23        23         23    23          23     23     23       23      23        0       23    23     23             23
##   2010    23     23        23         23    23          23     23     23       23      23        0       23    23     23             23
##   2011    23     23        23         23    23          23     23     23       23      23       23       23    23     23             23
##   2012    26     26        26         26    26          26     26     26       26      26       26       26    26     26             26
##   2013    26     26        26         26    26          26     26     26       26      26       26       26    26     26             26
##   2014    26     26        26         26    26          26     26     26       26      26       26       26    26     26             26
##   2015    26     26        26         26    26          26     26     26       26      26       26       26    26     26             26
##   2016    26     26        26         26    26          26     26     26       26      26       26       26    26     26             26
##   2017    26     26        26         26    26          26     26     26       26      26       26       26    26     26             26

Modeling changes in resistance levels

A Poisson model

Let’s consider the following model between 2 successive years:

\[ R_{y+1} \sim \mbox{Pois}\left(\left[R_y + \rho R_y \left(N_y - \frac{R_y}{N_y}\right)\right]\frac{N_{y + 1}}{N_y}\right) \]

Where \(N_y\) and \(N_{y + 1}\) are the sample sizes at years \(y\) and \(y+1\), and \(R_y\) and \(R_{y + 1}\) are the number of resistant samples at years \(y\) and \(y+1\).

data <- ecdc %>% dplyr::filter(bacteria == "Escherichia coli",
                       test     == "Third-generation cephalosporins",
                       country  == "Netherlands") %>% 
  dplyr::arrange(year) %>% 
  dplyr::select(N, R) %>% 
  head(2) %>% 
  as.matrix()

The following function computes the \(\lambda\) parameter of the Poisson distribution, as a funtion of the growth rate \(\rho\) (rho) of the resistant strains compared to the susceptible strain (i.e. \(\rho < 0\) means that the resistant strain spreads less than the susceptible strain, \(\rho = 0\) means that the resistant and susceptible strains spread equally well and \(\rho > 0\) means that the resistant strain spreads more than the susceptible strain), \(N_0\) (N_0) and \(R_0\) (R_0) the number of samples and resistant samples at a given year, and \(N_1\) (N_1) the number of samples the year after:

lambda <- function(rho, R0, N0, N1) {
  (R0 + rho * R0 * (N0 - R0 / N0)) * N1 / N0
}

Let’s try it:

lambda(.005, 1, 977, 1446)
## [1] 8.710033

The following function uses the lambda() function to compute the minus log-likelihood of a given values of \(\rho\), given the observed values \(N_0\), \(R_0\), \(N_1\) and \(R_1\):

mll <- function(theta, data) {
  -dpois(data[2, 2], lambda(theta, data[1, 2], data[1, 1], data[2, 1]), log = TRUE)
}

Let’s try it:

unname(mll(.01, data))
## [1] 4.393962

The funtion mll() is used by the following function in order to estimate the value of \(\rho\), together with its confidence interval:

estimate <- function(data, mll, interval, alpha = .95) {
  if (data[1, 2] < 1) return(rep(NA, 3))
  if (data[2, 2] < 2) return(rep(NA, 3))
  if (data[1, 1] < 2) return(rep(NA, 3))
  opt <- function(...) optimise(..., tol = .Machine$double.eps)
  mle <- opt(mll, interval, data)
  c(mle$minimum,
    opt(function(theta, data) abs(mll(theta, data) - mle$objective - qchisq(alpha, 1) / 2), c(interval[1], mle$minimum), data)$minimum,
    opt(function(theta, data) abs(mll(theta, data) - mle$objective - qchisq(alpha, 1) / 2), c(mle$minimum, interval[2]), data)$minimum)
}

Let’s try it:

(mle <- estimate(data, mll, c(-.1, .1)))
## [1] 0.004508967 0.001504724 0.009274910

Let’s illustrate the estimation on a figure:

theta_val <- seq(.001, .01, le = 100)
plot(theta_val, mll(theta_val, data), type = "l", xlab = "parameter value",
     ylab = "minus log-likelihood", col = "blue")
abline(v = mle, col = "green", lty = 2)
abline(h = mll(mle, data), col = "green", lty = 2)

The following function chunks a 2-column data frame into a list of 2 x 2 data frames, sliding row-wise a window of 2 rows:

chunk <- function(df) {
  lapply(1:(nrow(df) - 1), function(x) as.matrix(df[x:(x + 1), ]))
}

We can put this chunk() function together with the estimate() function into the following pipeline that estimates \(\rho\) with confidence interval for all the years of a data frame:

estimate_all <- function(data, mll, interval) {
  require(magrittr)
  data %<>% dplyr::arrange(year)
  data %>% 
    dplyr::select(N, R) %>% 
    chunk() %>% 
    lapply(estimate, mll, interval) %>% 
    do.call(rbind, .) %>%
    as.data.frame() %>% 
    setNames(c("estimate", "lower", "upper")) %>% 
    dplyr::mutate(year = data$year[-length(data$year)] + .5) %>% 
    dplyr::select(year, dplyr::everything())
}

Let’s try it on the resistance to 3rd-generation cephalosporins in \(E. coli\) in the Netherlands and then plot the results:

ecdc %>% dplyr::filter(bacteria == "Escherichia coli",
                       test     == "Third-generation cephalosporins",
                       country  == "Netherlands") %>%
  estimate_all(mll, c(-.1, .1)) %>% 
  with({
    plot(year, estimate, xlab = NA, ylab = "resistance fitness",
         ylim = c(min(lower, na.rm = TRUE), max(upper, na.rm = TRUE)), col = "blue", pch = 19)
    arrows(year, lower, year, upper, .05, 90, 3, col = "blue")
    abline(h = 0, lty = 2)
})

Let’s put the above code into the following function:

plot_fitness <- function(df, bug, drug, country, interval, color = "#7570b3") {
  est <- df %>% dplyr::filter(bacteria == !! bug,
                       test     == !! drug,
                       country  == !! country) %>%
    estimate_all(mll, interval)
  if (all(is.na(est$estimate))) {
    plot(NA, ann = F, axes = F, xlim = 0:1, ylim = 0:1)
  } else {
    with(est, {
      plot(year, estimate, xlab = NA, xlim = range(df$year),
           ylim = c(min(lower, na.rm = TRUE), max(upper, na.rm = TRUE)), col = color, pch = 19, type = "o",
           lty = 2, axes = FALSE, ann = FALSE)
      arrows(year, lower, year, upper, .05, 90, 3, col = color)
    })
    abline(h = 0, lty = 2)
    axis(4, col = color, col.axis = color)
    mtext("resistance fitness", 4, 1.5, cex = 2 / 3, col = color)
  }
}

Let’s combine the value of resistance prevalence and estimated resistance fitness on the same figure:

plot_resistance(ecdc2, "Escherichia coli", "Third-generation cephalosporins", "Netherlands")
par(new = TRUE)
plot_fitness(ecdc, "Escherichia coli", "Third-generation cephalosporins", "Netherlands", c(-.1, .1))

Let’s include the above code in the following function:

plot_res_fit <- function(df, bug, drug, ylim = NULL, interval = c(-.1, .1)) {
  opar <- par(mfrow = c(10, 3), plt = c(.13, .88, .2, .9))
  for (country in sort(unique(ecdc$country))) {
    plot_resistance(df, bug, drug, country, ylim, xlim = c(2000, 2017))
    par(new = TRUE)
    plot_fitness(df, bug, drug, country, interval)
    mtext(country)
  }
  par(opar)
}

Let’s now apply this function for a number of situations

Third-generation cephalosporins resistance in Escherichia coli

plot_res_fit(ecdc2, "Escherichia coli", "Third-generation cephalosporins")

Third-generation cephalosporins resistance in Klebsiella pneumoniae

plot_res_fit(ecdc2, "Klebsiella pneumoniae", "Third-generation cephalosporins")

Carbapenems resistance in Klebsiella pneumoniae

plot_res_fit(ecdc2, "Klebsiella pneumoniae", "Carbapenems")

Carbapenems resistance in Pseudomonas aeruginosa

plot_res_fit(ecdc2, "Pseudomonas aeruginosa", "Carbapenems")

Ceftazidime resistance in Pseudomonas aeruginosa

plot_res_fit(ecdc2, "Pseudomonas aeruginosa", "Ceftazidime")

MRSA

plot_res_fit(ecdc2, "Staphylococcus aureus", "Meticillin (MRSA)")

A binomial model

The following binomial model is expected to produce exactly the same results (to be checked):

\[ R_{y+1} \sim \mbox{Bin}\left(N_{y+1}, \frac{R_y}{N_y} + \rho R_y \left(1 - \frac{R_y}{N_y^2}\right)\right) \]